Abstract This article presents an automatic malfunction detection framework
based on a data mining approach to the analysis of network event sequences. The
considered environment is Long Term Evolution (LTE) for Universal Mobile
Telecommunications System (UMTS) with a sleeping cell caused by random access
channel failure. The sleeping cell problem means unavailability of network
service without a triggered alarm. The proposed detection framework uses N-gram
analysis to identify abnormal behavior in sequences of network events. These
events are collected with the Minimization of Drive Tests (MDT) functionality
standardized in LTE. Further processing applies dimensionality reduction,
anomaly detection with K-Nearest Neighbors (K-NN), cross-validation,
post-processing techniques and efficiency evaluation. The different anomaly
detection approaches proposed in this paper are compared against each other
with both classic data mining metrics, such as F-score and Receiver Operating
Characteristic (ROC) curves, and a newly proposed heuristic approach. The
achieved results demonstrate that the suggested method can be used in modern
performance monitoring systems for reliable, timely and automatic detection of
random access channel sleeping cells.

Keywords Data mining • sleeping cell problem • anomaly detection • performance 
monitoring • self-healing • LTE networks 


1 Introduction 

Modern cellular mobile networks are becoming increasingly diverse and complex,
due to the coexistence of multiple Radio Access Technologies (RATs) and their
corresponding releases. Additionally, small cells are actively deployed to
complement the macro layer coverage, and this trend will only grow. In the
future this situation is going to evolve towards even higher complexity, as in
5G networks there will be many more end-user devices, served by different
technologies and connected to cells of different types. New applications and
user behavior patterns come into play daily. In such an environment, network
performance and robustness become critical for mobile operators. To achieve
these goals, an efficient flow of Quality and Performance Management (QPM)
[34], which is a sequence of fault detection, diagnosis and healing, should be
developed and applied in the network in addition to other optimization
functions.


The concept of Self-Organizing Network (SON) [52], [53] has been proposed to
automate and optimize the most tedious manual tasks in mobile networks,
including QPM. Automation is the key idea in SON, and it has been proposed for
self-configuration, self-optimization and self-healing in LTE and UMTS networks
[60]. In traditional systems, detection, diagnosis and recovery of network
failures is a mostly manual task, heavily based on pre-defined thresholds,
aggregation and averaging of large amounts of performance data, the so-called
Key Performance Indicators (KPIs). Self-healing [59] automates the functions of
the QPM process to improve the reliability of network operation. However,
self-healing is still among the least studied functions of SON, and the
developed solutions and use cases require improvement prior to application in
real networks. This is especially important for non-trivial network failures
such as the sleeping cell problem. This is a special term used to denote a
breakdown which causes partial or complete degradation of network performance,
and which is hard to detect with conventional QPM within a reasonable time. In
the research and standardization community, automatic fault detection and
diagnosis functions, enhanced with the most recent advancements in data
analysis, are seen as the future of self-healing. Thus, development of improved
self-healing functions for detection of sleeping cell problems through
application of anomaly detection techniques is of high importance. This article
presents a novel framework based on N-gram analysis of MDT event sequences for
detection of random access channel sleeping cells.


The rest of this paper is organized as follows. Section 2 describes common
practices of quality and performance management in mobile networks, including
MDT functionality, and advanced methods based on knowledge mining algorithms.
Section 3 defines the concept of sleeping cell and its possible root cause
failures. In Section 4 the simulation environment, assumptions and random
access channel problem are presented. Section 4 also describes the generated
and analyzed MDT performance data. Section 5 concentrates on the suggested
sleeping cell detection knowledge mining framework. It includes an overview of
the applied anomaly detection methods: K-NN anomaly outlier scores, N-gram and
minor component analyses, post-processing and data mining performance
evaluation techniques. Section 6 is devoted to the actual research results.
Data structures at different stages of analysis are shown, and the efficiency
of different post-processing methods is compared. The final section gives the
concluding remarks regarding the findings of the presented research.


2 Quality and Performance Management in Cellular Mobile Networks

Performance management in wireless networks includes three main components:
data collection, analysis and results interpretation. Data gathering can be
done either by aggregation of cell-level statistics, i.e. collection of KPIs,
or by collection of detailed performance data with drive tests. The main
weakness in the analysis of KPIs is that a lot of statistics is left out at the
aggregation stage, due to averaging over time and network elements, and because
fixed threshold values are applied. Even though drive test campaigns provide
far more elaborate information regarding network performance, they are
expensive to carry out and do not cover the overall area of network operation.
Root cause analysis is done manually in the majority of cases, and because of
that there is room for more intelligent approaches to detection and diagnosis
of network failures, e.g. with data mining and anomaly detection techniques.
This would make it possible to automate the performance monitoring task
further.


2.1 Minimization of Drive Tests 

Yet another way to improve network QPM is to collect a detailed performance
database. This is enabled by the MDT functionality standardized in the 3rd
Generation Partnership Project (3GPP) [28]. MDT is designed for automatic
collection and reporting of user measurements, complemented with location
information where possible. Collected data is then reported to the serving
cell, which in turn sends it to the MDT server [36]. Thus, a large amount of
network and user performance data is available for analysis. This is where the
power of data mining and anomaly detection can be applied.

The specification describes several use cases for MDT: improvement of network
coverage, capacity, mobility robustness and end user quality of service.
According to the standard, MDT measurements and reporting can be done both in
idle and connected Radio Resource Control (RRC) modes. In logged MDT, the User
Equipment (UE) stores measurements in memory, and reporting is done at the next
transition from idle to connected state. In immediate MDT, measurements are
reported as soon as they are made, through the existing connection. In turn,
there are two measurement modes in immediate MDT: periodic and event-triggered
[36]. Periodic measurements are very useful for coverage and capacity
verification at initial network deployment, as they provide a detailed map of
network performance, e.g. in terms of signal propagation or throughput. The
main disadvantage of periodic measurements is that they consume a lot of
network and user resources. In contrast, the event-triggered approach provides
less information regarding the network status, but can be very efficient for
mobility robustness and resource savings. In our study, immediate
event-triggered MDT is used for collection of the performance database. Table 1
presents the list of network events which triggered MDT measurements and
reporting.

2.1.1 Location Estimation in MDT 

One of the important features of MDT is collection of geo-location information
at the moments of measurement. Whenever UE location is provided in an MDT
report, there are several ways to associate it with a particular cell: serving
cell ID, dominance maps and a new approach based on target cell ID information.

Table 1 Network events triggering MDT measurements and reporting

PL PROBLEM     - Physical Layer Problem [30].
RLF            - Radio Link Failure [61].
RLF REESTAB.   - Connection reestablishment after RLF.
A2 RSRP ENTER  - RSRP goes under A2 enter threshold.
A2 RSRP LEAVE  - RSRP goes over A2 leave threshold.
A2 RSRQ ENTER  - RSRQ goes over A2 enter threshold.
A3 RSRP        - A3 event, according to 3GPP specification.
HO COMMAND     - Handover command received.
HO COMPLETE    - Handover complete received [61].

Fig. 1 Wrap-around Macro 21 slow-faded dominance map (horizontal axis: X
coordinate, m)

Serving cell ID is available with the MDT event-triggered report, even for
early releases of LTE. However, in case of a coverage hole or problems with new
connection establishment, this approach can lead to mistakes in UE location
association, because in the worst case the faulty cell would never become
serving. This limits the usage of the serving cell method for sleeping cell
detection. To overcome this problem, the dominance map method can be used. A
dominance map shows the Evolved Universal Terrestrial Radio Access Network
(E-UTRAN) NodeB (eNB) with the dominating, i.e. strongest, radio signal at each
point of the network, see Fig. 1. Creation of a dominance map requires
information about path loss and slow fading. The main advantage of dominance
maps is that the mapping of cell ID to the location coordinates of a UE MDT
measurement is very precise, which results in higher accuracy of sleeping cell
detection. The downside of the dominance map approach is that it requires a lot
of detailed input measurement information. However, MDT functionality is one of
the ways to create such maps quickly and relatively simply. Additionally, more
accurate user location information is going to be available with deployment of
newer releases of mobile networks [23].

The last method for association of cell ID and UE report location uses the
target cell ID feature. The main advantage of this approach is that it does not
require serving cell ID, user geo-positioning location or knowledge about
network dominance areas. This eases the requirements for MDT data collection in
terms of the amount of detail regarding user location. The drawback of mapping
on the basis of target cell ID is that it might be useful for detection of only
particular types of network problems, such as random access Sleeping Cell (SC).
The efficiency of this method for detection of other malfunctions is subject to
further verification.

The key aspects which should be taken into account when selecting a location
association method are accuracy and the amount of information needed to create
the mapping between cell and user location.


2.2 Advanced data analysis approaches in QPM 

Studies in advanced data analysis for QPM can be divided into several groups.
In certain studies, the data reported by the users is used for the analysis.
For instance, in [50] the authors suggest a method for detection of sleeping
cells, caused by a transmitted signal strength problem, on the basis of
neighbor cell list information. Application of non-trivial pre-processing and
different classification algorithms made it possible to achieve relatively good
accuracy in detection of cell hardware faults. However, the proposed anomaly
detection system is prone to a relatively high false alarm rate. A method based
on analysis of TRACE-based user data with diffusion maps has also been
presented. More extensive application of diffusion maps for network performance
monitoring can be found in [44].

Even though user-level statistics are more detailed, the majority of studies
devoted to improvement of QPM still rely on cell-level data. The first
proposals for automating sleeping cell detection with statistical methods of
network monitoring suggest preparing a normal cell load profile and evaluating
the deviation of observed cell behavior from it as a way to identify
problematic cells. The idea of the statistical approach has been further
studied in [55], [62], [54], where a profile-based system for fault detection
and diagnosis is proposed. Bayesian networks have also been applied for
diagnosis and root cause probability estimation, given certain KPIs. The
complications here are the preparation of a correct probability model and
appropriate KPI threshold parameters. More advanced data mining methods have
been applied to the analysis of cell-level performance statistics, and novel
ensemble methods of classification algorithms have been proposed. In [19], [20]
application of classification and clustering methods for detection and
diagnosis of strangely behaving network regions is presented. Some studies also
consider neural network algorithms for detection of malfunctions.

The largest drawback of processing cell-level data is that collection of an
appropriate statistical base takes a substantial amount of time, which can vary
from days to months. This increases reaction time in case of outages and does
not completely solve the problems of operators in optimization of their QPM. To
overcome the weaknesses of analysis based on cell KPIs, our studies concentrate
on the analysis of user-level data collected with immediate MDT functionality.
In the early works, detection of cell outage caused by signal strength problems
(antenna gain failure) is studied. This area matches the 3GPP use case called
“cell outage detection”. Identification of the cell in malfunction condition is
done by analysis of numerical properties of a multidimensional dataset. Each
data point represents either a periodic or an event-triggered user measurement.
Such methods as the diffusion maps dimensionality reduction algorithm, k-means
clustering and k-nearest neighbor classification are applied.

To increase the robustness of the proposed solutions in MDT data analysis and
make the developed detection system suitable for application in real networks,
a more sophisticated experimental setup is considered. A sleeping cell caused
by malfunction of the random access channel, discussed in Section 3, does not
produce coverage holes from the perspective of radio signal, but still makes
service unavailable to the subscribers. This problem is considered to be one of
the most complex for mobile network operators, as detection of such failures
may take days or even weeks, and negatively affects user experience [34]. To
make the fault detection framework more flexible and independent of user
behavior, such as variable mobility and traffic variation, analysis of
numerical characteristics of MDT data is substituted with processing of network
event sequences with the N-gram method. Network events can be of a mobility- or
signaling-related nature, such as A2, A3 or the handover complete message.
Initial results in this area have been presented in earlier work.


3 Sleeping Cell Problem 

A sleeping cell is a special kind of cell service degradation. It means a
malfunction resulting in a network performance decrease that is invisible to
the network operator but affects user Quality of Experience (QoE). On the one
hand, detection of the sleeping cell problem with traditional monitoring
systems is complicated, as in many cases KPI thresholds do not indicate the
problem. On the other hand, fault identification can be very sluggish, as
creation of a cell behavior profile requires a long time, as discussed in the
previous section. Regular, less sophisticated types of failures usually produce
cell-level alarms to the performance monitoring system of the mobile network
operator. In contrast, for sleeping cells degradation occurs seamlessly and no
direct notification is given to the service provider.

In general, any cell can be called degraded if it is not 100% functional, i.e.
its services suffer in terms of quality, which in turn affects user experience.
Sleeping cells are classified into three types according to the extent of
performance degradation, from the lightest to the most severe [16]: impaired or
deteriorated, with the smallest negative impact on the provided service;
crippled, characterized by severely decreased capacity; and catatonic, a kind
of outage which leads to complete absence of service in the faulty area, so
that the cell does not carry any traffic.

Degradation can be caused by malfunction of different hardware or software
components of the network. Depending on the failure type, a different extent of
performance degradation can be induced. In this study the considered sleeping
cell problem is caused by Random Access Channel (RACH) failure. This kind of
problem can appear due to RACH misconfiguration, excessive load or a
software/firmware problem at the eNB side [2], [65]. RACH malfunction leads to
inability of the affected cell to serve any new users, while earlier connected
UEs still get served, as pilot signals are transmitted. This problem can be
classified as a crippled sleeping cell, and with time it tends to become
catatonic. In many cases the RACH problem becomes visible to the operator only
after a long observation time or even due to user complaints. For this reason,
it is very important to detect such cells in a timely manner and apply recovery
actions.

3.0.1 Random Access Sleeping Cell 

Malfunction of RACH can lead to severe problems in network operation, as it is
used for connection establishment at the beginning of a call, during handover
to another cell, and for connection re-establishment after handover failure or
Radio Link Failure (RLF). In this study, malfunction of random access in the
cell with ID 1 is caused by erroneous behavior of the T304 timer, which expires
before the random access procedure is finished. Thus, whenever a UE tries to
initiate random access to cell 1, the attempt fails. The malfunction area
covers around 5 % of the overall network.


4 Experimental Setup 

4.1 Simulation environment 

The experimental environment is a dynamic system-level simulator of an LTE
network, designed according to 3GPP Releases 8, 9, 10 and partly 11.
Throughput, spectral efficiency and mobility-related behavior of this simulator
have been validated against results from other simulators of several companies
in 3GPP. The step resolution of the simulator is one Orthogonal
Frequency-Division Multiplexing (OFDM) symbol. The methodology for mapping
link-level SINR to the system level is presented in [8]. The simulation
scenario is an improved 3GPP macro case 1 [29] with wrap-around layout, 21
cells (7 base stations with 3-sector antennas), and inter-site distance of 500
meters. Modeling of propagation and radio link conditions includes slow and
fast fading. Users are spread randomly around the network, so that on average
there are 15 dynamically moving UEs per cell. The main configuration parameters
of the simulated network are shown in Table 2.


4.2 Generated Performance Data 

Generated performance data includes dominance map information and MDT log, 
which contains the following fields: 

- MDT triggering event ID. The list of possible events is presented in Table 1.
  This is categorical (nominal) and sequential data, i.e. sequences of events
  are meaningful from the data mining perspective;

- UE ID. This is also categorical data;

- UE location coordinates [m]. This is numerical, spatial data;

- Serving and target cell ID. This is spatial, categorical data.

It is important to know the type of the analyzed data to construct an efficient
knowledge mining framework [10], [35].
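
The log fields listed above can be pictured as one record per reported event.
The sketch below is only an illustration: the class name `MdtRecord`, its
attribute names and the helper `event_sequences` are hypothetical, and it is
assumed that the log is already ordered in time and that one UE corresponds to
one call.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MdtRecord:
    """One MDT log entry (field names are illustrative assumptions)."""
    event: str         # MDT triggering event ID, e.g. "A3 RSRP"
    ue_id: int         # categorical
    x_m: float         # UE location, metres
    y_m: float
    serving_cell: int  # categorical, spatial
    target_cell: int


def event_sequences(log):
    """Group a time-ordered MDT log into one event sequence per UE."""
    sequences = defaultdict(list)
    for record in log:
        sequences[record.ue_id].append(record.event)
    return dict(sequences)


log = [
    MdtRecord("A2 RSRP ENTER", 7, 120.0, 340.5, 3, 3),
    MdtRecord("A3 RSRP", 7, 131.2, 338.0, 3, 1),
    MdtRecord("HO COMMAND", 7, 133.0, 337.1, 3, 1),
    MdtRecord("A3 RSRP", 9, 410.0, 95.0, 12, 14),
]
sequences = event_sequences(log)
print(sequences[7])
```

Only the event names are kept in the sequences, since it is the order of events
that carries the sequential information; the location and cell fields are
needed later to map anomalous samples to cells.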

Table 2 General Simulation Configuration Parameters

Parameter                                 Value
Cellular layout                           Macro 21 wrap-around
Number of cells                           21
UEs per cell                              17
Inter-site distance                       500 m
Link direction                            Downlink
RRC IDLE mode                             Disabled
User distribution in the network          Uniform
Maximum BS TX power                       46 dBm
Initial cell selection criterion          Strongest RSRP value
Handover margin (A3 margin)               3 dB
Handover time to trigger                  256 ms
Hybrid Automatic Repeat reQuest (HARQ)    Enabled
Slow fading standard deviation            8 dB
Slow fading resolution                    5 m
Simulation length                         572 s (9.5 min)
Simulation resolution                     1 time step = 71.43 μs
Network synchronicity mode                Asynchronous
Max number of UEs/cell                    20
UE velocity                               30 km/h
Duration of calls
Traffic model                             Constant Bit Rate 320 kbps
Normal and Reference cases                Simulation without sleeping cell
Problematic case                          Simulation with RACH problem in cell 1
A2 RSRP Threshold                         -110
A2 RSRP Hysteresis                        3
A2 RSRQ Threshold                         -10
A2 RSRQ Hysteresis                        2

Simulations done for this study cover three types of network behavior:
“normal” - network operation without random access sleeping cell;
“problematic” - network with RACH failure in cell 1; “reference” - no sleeping
cell, but different slow and fast fading maps, i.e. compared to the “normal”
case it is a different network propagation-wise. The latter case is used for
validation purposes. All three cases have different mobility random seeds, i.e.
call start locations and UE traveling paths are not the same. Each of the cases
is represented by 6 data chunks. The training and testing phases of sleeping
cell detection are done with pairs of MDT logs by means of the K-fold approach
[35]. For example, “normal”-“problematic” or “normal”-“reference” cases are
considered. Thus, in total there are 72 unique combinations of analyzed MDT log
pairs, which is a reasonably reliable statistical base.
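
The count of 72 pairs can be reproduced under one plausible reading of the
setup, which is an assumption and is not stated explicitly in the text: each of
the 6 “normal” chunks is used for training and is paired with each of the 6
“problematic” chunks and each of the 6 “reference” chunks for testing. The
function name and chunk labels below are hypothetical.

```python
from itertools import product

CHUNKS_PER_CASE = 6  # number of data chunks per simulated case


def log_pairs(train_case, test_cases, chunks=CHUNKS_PER_CASE):
    """Enumerate (training chunk, testing chunk) pairs of MDT logs."""
    pairs = []
    for test_case in test_cases:
        for i, j in product(range(chunks), repeat=2):
            pairs.append(((train_case, i), (test_case, j)))
    return pairs


pairs = log_pairs("normal", ["problematic", "reference"])
print(len(pairs))  # 2 test cases * 6 * 6 chunk combinations = 72
```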


5 Sleeping Cell Detection Framework 

The core of the presented study is the sleeping cell detection framework based
on knowledge mining, Fig. 2. Both training and testing phases are done in
accordance with the process of Knowledge Discovery in Databases (KDD), which
includes the following steps [35]: data cleaning, integration from different
sources, feature selection and extraction, transformation, pattern recognition,
pattern evaluation and knowledge presentation. The constructed data analysis
framework for sleeping cell detection is semi-supervised, because unlabeled
error-free data is used for training of the data mining algorithms. In the
testing phase, problematic data is analyzed to detect abnormal behavior.
Reference data is used for testing in order to verify how prone the designed
framework is to false alarms.

Fig. 2 Sleeping Cell Detection Framework


5.1 Feature Selection and Extraction 

Feature selection and extraction is the first step of sleeping cell detection.
At this stage, input data is prepared for further analysis. Pre-processing is
needed because the MDT event sequences reported by UEs have variable lengths,
depending on the user call duration, velocity, traffic distribution and network
layout.


5.1.1 Sliding Window Pre-processing 

The sliding window approach [56] divides calls into sub-calls of constant
length, and thereby unifies the input data. There are two parameters in the
sliding window algorithm: window size m and step n. After transformation, one
sequence of N events (a call) is represented by several overlapping (if n < m)
sequences of equal size, except for the last sub-call, which is the remainder
from N modulo n.

In the presented results the overlapping sliding window size is 15, and the
step is 10 events. Such a setup maintains the context of the data after
processing [44]. The number of calls and sub-calls for all three data sets is
shown in Table 3.
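
The sliding window step can be sketched as follows. This is a minimal
illustration rather than the implementation used in the study: the function
name is hypothetical, and the handling of the last, shorter sub-call (it keeps
whatever events remain) is an assumption.

```python
def sliding_windows(events, size=15, step=10):
    """Split one call (a list of event IDs) into overlapping sub-calls.

    Assumption: the last sub-call keeps whatever events remain,
    so it may be shorter than `size`.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    windows = []
    start = 0
    while start < len(events):
        windows.append(events[start:start + size])
        if start + size >= len(events):
            break  # this window already reaches the end of the call
        start += step
    return windows


call = list(range(32))  # hypothetical call consisting of 32 events
sub_calls = sliding_windows(call)
print([len(w) for w in sub_calls])  # [15, 15, 12]
```

With a window of 15 and a step of 10, neighboring sub-calls share 5 events,
which is what keeps the context across sub-call boundaries.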

Table 3 Number of calls and sub-calls in analyzed data

Amount / Dataset      Normal   Problem   Reference
Calls (all)           2530     1940      2540
Sub-calls (all)       7230     7134      7201
Normal sub-calls      6869     5932      6821
Abnormal sub-calls    361      1202      380


Table 4 Example of N-gram analysis per character, N = 2.

Analyzed word   pe  er  rf  fo  or  rm  ma  me  an  nc  ce
performance     1   1   1   1   1   1   1   0   1   1   1
performer       1   2   1   1   1   1   0   1   0   0   0


5.1.2 N-Gram Analysis 

When the input user-specific MDT log entries have been standardized with the
sliding window method, the data is transformed from sequential to numeric
format. This is done with the N-gram analysis method, widely used e.g. in
natural language processing and text analysis applications such as speech
recognition, parsing and spelling. In addition, N-gram analysis is applied to
whole-genome protein sequences [26] and to computer virus detection.

An N-gram is a sub-sequence of N consecutive items or units from a given
original sequence; neighboring N-grams overlap. The items can be characters,
letters, words or anything else. The idea of the method is to count how many
times each sub-sequence occurs. This is the transformation from sequential to
numerical space.

Here is an example of N-gram analysis applied to two words, ‘performance’ and
‘performer’, with N = 2 and a single character as the unit. The resulting
frequency matrix after N-gram processing is shown in Table 4.
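
The counting step can be written in a few lines. The sketch below is only an
illustration with hypothetical function names; it builds the frequency matrix
for the two example words and works the same way when each sequence is a list
of MDT event names instead of a string.

```python
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count every run of n consecutive items in the sequence."""
    return Counter(tuple(sequence[i:i + n])
                   for i in range(len(sequence) - n + 1))


def frequency_matrix(sequences, n=2):
    """Rows are sequences, columns are the N-grams seen in any of them."""
    counts = [ngram_counts(s, n) for s in sequences]
    columns = sorted(set().union(*counts))
    return columns, [[c[g] for g in columns] for c in counts]


columns, rows = frequency_matrix(["performance", "performer"], n=2)
for gram, a, b in zip(columns, *rows):
    print("".join(gram), a, b)
```

Each row of the matrix becomes one numerical data point, so a sub-call of event
names is turned into a vector of 2-gram counts.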


5.2 Dimensionality Reduction with Minor Component Analysis 

Dimensionality reduction is applied to convert high-dimensional data to a
smaller set of derived variables. In the presented study the Minor Component
Analysis (MCA) method is applied. This algorithm has been selected on the basis
of comparison with other dimensionality reduction methods such as Principal
Component Analysis (PCA) [13] and diffusion maps [22]. MCA extracts the
components of the covariance matrix of the input data set and uses the minor
components (eigenvectors with the smallest eigenvalues of the covariance
matrix). 6 minor components are used as a basis of the embedded space. This
number is defined by means of the Second ORder sTatistic of the Eigenvalues
(SORTE) method [38], [39].
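
A minimal NumPy sketch of the projection onto minor components is given below.
It is an illustration under stated assumptions, not the implementation used in
the study: the function names are hypothetical, the number of components is
fixed to 6 instead of being selected with SORTE, the toy input is random, and
it is assumed that testing data is centered with the training mean.

```python
import numpy as np


def fit_mca(train, n_components=6):
    """Return the training mean and the minor-component basis."""
    mean = train.mean(axis=0)
    cov = np.cov(train - mean, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the first columns
    # of `eigenvectors` belong to the smallest eigenvalues.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    return mean, eigenvectors[:, :n_components]


def embed(data, mean, basis):
    """Project rows of `data` onto the minor components."""
    return (data - mean) @ basis


rng = np.random.default_rng(0)
train = rng.poisson(1.0, size=(500, 20)).astype(float)  # toy 2-gram counts
mean, basis = fit_mca(train)
embedded = embed(train, mean, basis)
print(embedded.shape)  # (500, 6)
```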


5.3 Pattern Recognition: K-NN Anomaly Score Outlier Detection

In order to extract abnormal instances from the testing dataset, the K-NN
anomaly outlier score algorithm is applied. In contrast with K-NN
classification, the method is not supervised but semi-supervised, as the
training data does not contain any abnormal labels. In general, there are two
approaches to the implementation of this algorithm: the anomaly score assigned
to each point is either the sum of distances to the k nearest neighbors [3] or
the distance to the k-th neighbor [58]. The first method is employed in the
presented sleeping cell detection framework, as it is more statistically
robust. Thus, the algorithm assigns an anomaly score to every sample in the
analyzed data based on the sum of distances to its k nearest neighbors in the
embedded space. The Euclidean metric is applied as the similarity measure.
Points with the largest anomaly scores are called outliers. Separation into
normal and abnormal classes is defined by the threshold parameter T, equal to
the 95th percentile of anomaly scores in the training data.

Configuration parameters of the data analysis algorithms in the presented
sleeping cell detection framework are summarized in Table 5.

Table 5 Parameters of algorithms in sleeping cell detection framework

Parameter                                            Value
Number of chunks in K-fold method per dataset        6
Sliding window size                                  15
Sliding window step                                  10
N in N-gram algorithm                                2
Number of nearest neighbors (k) in K-NN algorithm    35
Number of minor components                           6
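
The anomaly scoring and thresholding can be sketched with NumPy as shown below.
This is a brute-force illustration on random toy data, with hypothetical
function and variable names. It assumes that the score of a training point
skips the point itself, so that training and testing scores are comparable.

```python
import numpy as np


def knn_anomaly_scores(train, query, k=35, skip_self=False):
    """Sum of Euclidean distances from each query row to its k nearest
    training rows."""
    diff = query[:, None, :] - train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    dist.sort(axis=1)
    first = 1 if skip_self else 0  # drop the zero distance of a point to itself
    return dist[:, first:first + k].sum(axis=1)


rng = np.random.default_rng(1)
train = rng.normal(size=(200, 6))            # toy "normal" embedded data
test = np.vstack([rng.normal(size=(5, 6)),   # samples similar to training
                  np.full((1, 6), 10.0)])    # one obvious outlier

train_scores = knn_anomaly_scores(train, train, skip_self=True)
threshold = np.percentile(train_scores, 95)  # threshold T
is_abnormal = knn_anomaly_scores(train, test) > threshold
print(is_abnormal)
```

By construction about 5 % of the training samples lie above T, which is why
post-processing is needed afterwards to separate a real fault from this
background of expected outliers.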


5.4 Pattern Evaluation 

The main goal of pattern evaluation is conversion of the output of the K-NN
anomaly score algorithm into knowledge about the location of the network
malfunction, i.e. the RACH sleeping cell. This is achieved by post-processing
of the anomalous data samples through analysis of their correspondence to
particular network elements, such as UEs and cells. Four post-processing
methods are developed for this purpose. The essence of these methods, discussed
throughout this section, is reflected in their names. The first part of the
name describes which geo-location information is used for mapping data samples
to cells, e.g. dominance map information, target or serving cell ID. The second
part denotes what is used as the feature space for post-processing. It can be
either “sub-calls”, when rows of the dataset are used as features, or “2-gram”,
when individual event pair combinations, i.e. columns of the dataset, are used
as features. The third and last part of the method name describes whether the
analysis considers the difference between training and testing data
(“deviation” keyword), or whether only information about the testing set is
used to build the sleeping cell detection histogram.

The output of the post-processing methods described above is a set of values,
the sleeping cell scores, one for each cell in the analyzed network. A higher
value of this score means higher abnormality, and hence higher probability of
failure. To achieve a clearer indication of the presence of a problematic cell,
an additional non-linear transformation is applied. It is called amplification,
as it emphasizes problematic areas in the sleeping cell histogram. The sleeping
cell score of each cell is divided by the sum of SC scores of all
non-neighboring cells. The sleeping cell scores obtained after post-processing
and amplification are then normalized by the cumulative SC score of all cells
in the network. Normalization is necessary to remove the dependency on the size
of the dataset, i.e. the number of calls and users.
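
Amplification and normalization can be sketched as follows. The function name,
the neighbor relation and the toy scores are hypothetical. Two details are
assumptions rather than statements from the text: the cell itself is excluded
from its own set of non-neighboring cells, and the normalized scores are
expressed in percent.

```python
def amplify_and_normalize(scores, neighbors):
    """scores[c] is the raw SC score of cell c; neighbors[c] is a set of
    indices of the cells adjacent to c."""
    cells = range(len(scores))
    amplified = []
    for c in cells:
        # Assumption: "non-neighboring" excludes the cell itself as well.
        far = [j for j in cells if j != c and j not in neighbors[c]]
        denominator = sum(scores[j] for j in far)
        amplified.append(scores[c] / denominator if denominator > 0 else 0.0)
    total = sum(amplified)
    # Assumption: express the result in percent of the cumulative score.
    return [100.0 * a / total if total > 0 else 0.0 for a in amplified]


# Hypothetical 6-cell ring: each cell neighbors the previous and the next one.
neighbors = [{(c - 1) % 6, (c + 1) % 6} for c in range(6)]
raw = [9.0, 3.0, 1.0, 1.0, 1.0, 3.0]
final = amplify_and_normalize(raw, neighbors)
print([round(v, 1) for v in final])
```

In this toy case cell 0 holds 50 % of the raw score and roughly 68 % after
amplification and normalization. Elevated scores in its neighbors, which are
likely side effects of the same fault, do not enter its denominator.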

5.5 Knowledge Interpretation and Presentation

The final step of the data analysis framework is visualization of the fault
detection results. This is done by constructing a sleeping cell detection
histogram and a network heat map. The sleeping cell histogram alone does not
show how cells are related to each other, i.e. whether they are neighbors or
not, and which area of the network is causing problems. The heat map shows more
anomalous network regions with darker and larger spots, while normally
operating regions are in light grey. The main benefit of the network heat map
is that the mobile network topology and neighbor relations between cells are
illustrated.

5.5.1 Performance Evaluation 

To apply data mining performance evaluation metrics, the labels of data points must
be known. A cell is labeled as abnormal if its SC score deviates by more than 3σ
(standard deviation of sleeping cell scores) from the mean SC score in the network.
The mean value and standard deviation of the sleeping cell scores are calculated
jointly from 72 runs produced by the K-fold method for the “normal”-“problematic” and
“normal”-“reference” dataset pairs. Availability of the labels and the outcomes of
different post-processing methods enables application of such data mining performance
metrics as accuracy, precision, recall, F-score, True Negative Rate (TNR)
and False Positive Rate (FPR) [32]. ROC curves are plotted on the basis of these
scores.
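
A minimal sketch of the labeling rule and the listed metrics is given below.
The function names are hypothetical, the mean and standard deviation are passed
in as parameters rather than computed over the 72 runs, and the one-sided test
(score above mean + 3σ) is an assumption based on how the threshold is used in
Section 6.

```python
import numpy as np

def label_cells(sc_scores, mean, std, k=3.0):
    """Flag a cell as abnormal when its SC score exceeds mean + k * std."""
    return np.asarray(sc_scores, dtype=float) > mean + k * std

def detection_metrics(truth, predicted):
    """Confusion-matrix metrics for boolean arrays (True = abnormal)."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = int(np.sum(truth & predicted))
    tn = int(np.sum(~truth & ~predicted))
    fp = int(np.sum(~truth & predicted))
    fn = int(np.sum(truth & ~predicted))

    def ratio(a, b):
        # Guard against empty classes instead of dividing by zero.
        return a / b if b else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, truth.size),
        "precision": precision,
        "recall": recall,
        "f_score": ratio(2 * precision * recall, precision + recall),
        "tnr": ratio(tn, tn + fp),
        "fpr": ratio(fp, fp + tn),
    }
```

Sweeping `k` and recording the resulting (FPR, recall) pairs is one way to obtain
the points of an ROC curve from these functions.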

In addition to the conventional performance evaluation metrics described above,
a heuristic method is applied to complement the analysis. This approach measures
how far the achieved performance is from the a priori known ideal solution.
Performance of the sleeping cell detection algorithm can be described by a point in
the space “sleeping cell magnitude”-“cumulative standard deviation”. “Sleeping cell
magnitude” is the highest sleeping cell score, and a sum of all sleeping cell scores is
“cumulative standard deviation”. This plane contains two points of interest. In the case
of a malfunctioning network, the ideal sleeping cell detection algorithm would have
coordinate [0; 100]. In the case of an error-free network, the ideal performance is the
point [0; 100/N_cells], where N_cells is the number of cells in the network. Thus, the
smaller the Euclidean distance between the achieved and ideal sleeping cell histograms,
the better the performance of the sleeping cell detection algorithm.
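
The distance computation can be sketched as below. The ideal coordinates are
copied as printed in the text; the function names are hypothetical, and how the
achieved point is derived from a histogram is left to the caller, since that
step is not restated here.

```python
import math

def ideal_point(n_cells, malfunction):
    """Ideal point as printed in the text: [0; 100] or [0; 100 / N_cells]."""
    return (0.0, 100.0) if malfunction else (0.0, 100.0 / n_cells)

def heuristic_distance(achieved, ideal):
    """Euclidean distance between the achieved and the ideal point."""
    return math.dist(achieved, ideal)
```

A smaller returned value corresponds to a post-processing method that is closer
to the ideal solution.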


6 Results of Sleeping Cell Detection 

This section presents the results of sleeping cell detection for different post-processing 
algorithms. In addition, the data at different stages of the detection process is 
illustrated. Then performance metrics are used to compare effectiveness of the 
developed SC identification algorithms. 


6.1 Pre-processing and K-NN Anomaly Score Calculations 

Fig. 3 Normal dataset used for training of the sleeping cell detection framework:
(a) normal training dataset in the embedded space; (b) sorted outlier scores of
normal training dataset.

After pre-processing with sliding window and N-gram methods, and transformation
with MCA, the training MDT data is processed with the K-NN anomaly score
algorithm. As discussed in Section 5.3, the anomaly score threshold, used for
separation of data points into normal and abnormal classes, is selected to be the
95th percentile of the outlier scores in the training data. The shape of the normal
training dataset in the embedded space is shown in Fig. 3a, and sorted anomaly
outlier scores are presented in Fig. 3b. It can be seen that data points are very
compact in the embedded space, and because of that there is no big difference in
the anomaly score values.
The main goals of analyzing the testing dataset are to find anomalies, detect the
sleeping cell, and keep the false alarm rate as low as possible. At the testing phase
either problematic or reference data are analyzed. After the same pre-processing stages
as for training, the testing data is represented in the embedded space. When the
testing data is the problematic dataset, some of the samples are significantly further
away from the main dense group of points, Fig. 4a. These abnormal points are labeled as
outliers, and the corresponding anomaly scores for these samples are much higher,
as can be seen from Fig. 4b. On the other hand, some of the points with relatively
low anomaly score are above the abnormality threshold. This means that there
is still a certain percentage of false alarms, i.e. some “good” points are treated as
“bad”. The extent of the negative effect caused by false alarms is discussed further in
Section 6.4. However, there is no opposite behavior, referred to as “miss-detection”:
none of the anomalous points are treated as normal.
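
The training and testing steps described above can be sketched as follows. This
is an illustration under stated assumptions, not the exact scoring rule of
Section 5.3: the anomaly score is taken here to be the mean Euclidean distance to
the k nearest training points, the value k = 5 is arbitrary, and all function
names are hypothetical. Only the 95th percentile threshold comes from the text.

```python
import numpy as np

def knn_scores(train, points, k=5, exclude_self=False):
    """Mean distance from each point to its k nearest training points."""
    train = np.asarray(train, dtype=float)
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (len(points), len(train)).
    d = np.linalg.norm(points[:, None, :] - train[None, :, :], axis=2)
    d.sort(axis=1)
    # When scoring the training set against itself, skip the zero self-distance.
    start = 1 if exclude_self else 0
    return d[:, start:start + k].mean(axis=1)

def fit_threshold(train, k=5, percentile=95.0):
    """Threshold = given percentile of the training outlier scores."""
    return np.percentile(knn_scores(train, train, k, exclude_self=True), percentile)

def detect(train, test, k=5, percentile=95.0):
    """Return test anomaly scores and a boolean outlier flag per test point."""
    threshold = fit_threshold(train, k, percentile)
    scores = knn_scores(train, test, k)
    return scores, scores > threshold
```

The dense pairwise distance matrix keeps the sketch short; for datasets with many
thousands of sub-calls a tree-based nearest-neighbor search would be a more
practical choice.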

Validation of the data mining framework is done by using the error-free reference
dataset as testing data. No real anomalies are present in the network behavior.
Reference testing data in the embedded space and the corresponding anomaly outlier
scores are shown in Fig. 5. Only a few points can be treated as outliers, and in
general the shapes of the normal (Fig. 3a) and reference (Fig. 5a) datasets in the
embedded space are very similar. Anomaly outlier scores of the reference testing
data are low for all points, except 2 outliers.


6.2 Application of Post-Processing Methods for Sleeping Cell Detection 

After the training and testing phases, certain sub-calls are marked as anomalies. The
next step is conversion of this information to knowledge about the location of the
malfunctioning cell or cells, and this is done through post-processing described in
Section 5.4.

Fig. 4 Problematic dataset used at the testing phase of the sleeping cell detection
framework: (a) problem testing dataset in the embedded space. Plot legend: normal
sub-calls, abnormal sub-calls, 95% threshold; horizontal axis: sub-call.

Fig. 5 Reference dataset used at the testing phase of the sleeping cell detection
framework: (a) reference testing dataset in the embedded space; (b) sorted outlier
scores of reference testing dataset.


6.2.1 Detection based on Dominance Cell Sub-Call Deviation 

Fig. 6 Results of sleeping cell detection for Dominance Cell Sub-Call Deviation method:
(a) problematic dataset sleeping cell detection histogram; (b) reference dataset
sleeping cell detection histogram; (c) problematic dataset heat map; (d) reference
dataset heat map.

In our earlier study [13], post-processing based on dominance cells and call deviation
for sleeping cell detection is presented. One problem of using calls as samples
is that, if the duration of the analyzed user call is long, the corresponding
number of visited cells is large, especially for fast UEs. Hence, even if a certain call
is classified as abnormal, it is very hard to say which cell has anomalous behavior.
To overcome this problem, analysis is done for sub-calls, derived with the sliding
window method, see Section 5.1.1. The majority of sub-calls contain the same number
of network events, and the length of the analyzed sequence is short enough to
identify the exact cell with problematic behavior. Deviation measures the difference
between training and testing data, and it is used to construct the sleeping cell
detection histogram presented in Fig. 6a. From this figure, it can be seen that abnormal
sub-calls are encountered more frequently in the area of dominance of cell 1, which
has the highest deviation. There are 2 types of bars: colored (in this case blue)
and grey. The second variant implies an additional post-processing step,
amplification, described in Section 5.4. In addition to cell 1, its neighboring
cells 8, 9, 11 and 12 also have increased deviation values, as can be seen from
the network heat map in Fig. 6c. The sleeping cell detection histogram and network
heat map for the reference dataset used as testing data are shown in Fig. 6b and 6d,
respectively. Even though cells 6 and 17 have higher SC scores than other cells,
they are not marked as abnormal, because their abnormality does not reach the mean
+ 3σ level.

6.2.2 Detection based on Dominance Cell 2-Gram Deviation 

Fig. 7 Results of sleeping cell detection for Dominance Cell 2-Gram Deviation method:
(a) problematic dataset sleeping cell detection histogram; (b) reference dataset
sleeping cell detection histogram; (c) problematic dataset heat map; (d) reference
dataset heat map. Histogram horizontal axis: cell ID.

In this method problematic network regions are found through comparison of
occurrence frequencies, normalized by the total number of users, in the training and
testing datasets. If there is a big increase or decrease, the cell associated
with these changes is marked as abnormal. From the sleeping cell detection histogram
in Fig. 7a it can be seen that cell 1 has a clear difference in the number of 2-gram
occurrences in testing data compared to training data. This happens because
handovers toward this cell fail. Due to this fact, 2-gram sequences with events
related to handovers become imbalanced in testing data compared to training
data. For instance, 2-grams like Handover (HO) Command - HO Complete and
HO Complete - A2 RSRP ENTER become very rare. On the other hand, the 2-gram
HO Command - A2 RSRP ENTER, which can be treated as an indication of unsuccessful
handovers, conversely becomes very frequent in testing data, while in
training data it does not exist at all. Among the neighbors of problematic cell
1, only cell 11 has a slightly increased sleeping cell score. Testing the sleeping cell
detection framework with reference data and post-processing with the Dominance Cell
2-Gram Deviation method demonstrates a lower false alarm rate than Dominance
Cell Sub-Call Deviation, as can be seen from Fig. 7b and 7d.
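
The frequency comparison can be sketched as below. The record format, the
function names, and the use of a sum of absolute frequency differences as the
per-cell score are assumptions made for illustration; the exact deviation formula
is defined in Section 5.4.

```python
from collections import Counter, defaultdict

def two_gram_frequencies(records, n_users):
    """Per-cell 2-gram occurrence frequencies, normalized by number of users.

    records: iterable of (cell_id, [event, event, ...]) pairs
             (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for cell, events in records:
        # Consecutive event pairs form the 2-grams of the sequence.
        for pair in zip(events, events[1:]):
            counts[cell][pair] += 1
    return {
        cell: {gram: c / n_users for gram, c in grams.items()}
        for cell, grams in counts.items()
    }

def two_gram_deviation(train_freq, test_freq):
    """Per-cell sum of absolute frequency changes (assumed scoring rule)."""
    scores = {}
    for cell in set(train_freq) | set(test_freq):
        tr = train_freq.get(cell, {})
        te = test_freq.get(cell, {})
        scores[cell] = sum(
            abs(te.get(g, 0.0) - tr.get(g, 0.0)) for g in set(tr) | set(te)
        )
    return scores
```

A 2-gram that disappears in testing data and one that appears only in testing
data both increase the score of the associated cell, which matches the handover
example given in the text.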


6.2.3 Detection based on Dominance Cell 2-Gram Symmetry Deviation 

This post-processing method analyzes the symmetry imbalance of network event
2-grams. Information about the number of 2-grams directed to the cell, and from the
cell, is extracted from the training data. Thus, only 2-grams which occur at cell
borders, i.e. in the dominance area of 2 cells, are considered. It means that if in the
training data the number of handovers from Cell A to Cell B, and from Cell B to Cell A,
is roughly the same, this cell has a balanced 2-gram. If in the testing data this
balance (the number of 2-grams which start in this cell relative to the number which
are directed to it) is changed, it can be concluded that the symmetry of this
particular 2-gram is skewed, i.e. disturbed compared to the training data.
The most common types of 2-grams analyzed with this method are related
to handovers, e.g. A3 - HO COMMAND sequences.
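
One possible way to express such a symmetry check is sketched below. The
(from_cell, to_cell) input format, the balance ratio, the default of 0.5 for
cells without observations, and the function names are all assumptions for this
illustration and are not taken from the method definition.

```python
from collections import Counter

def direction_balance(transitions):
    """Share of border 2-grams that start in each cell.

    transitions: iterable of (from_cell, to_cell) pairs, one per 2-gram
                 observed at a cell border (hypothetical input format).
    """
    outgoing, incoming = Counter(), Counter()
    for src, dst in transitions:
        outgoing[src] += 1
        incoming[dst] += 1
    cells = set(outgoing) | set(incoming)
    # 0.5 means as many 2-grams leave the cell as are directed to it.
    return {c: outgoing[c] / (outgoing[c] + incoming[c]) for c in cells}

def symmetry_deviation(train_transitions, test_transitions):
    """Absolute change of the balance between training and testing data."""
    tr = direction_balance(train_transitions)
    te = direction_balance(test_transitions)
    # Assumption: a cell with no observations is treated as balanced (0.5).
    return {c: abs(te.get(c, 0.5) - tr.get(c, 0.5)) for c in set(tr) | set(te)}
```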
Fig. 8 Results of sleeping cell detection for Dominance Cell 2-Gram Symmetry Deviation
method: (a) problematic dataset sleeping cell detection histogram; (b) reference
dataset sleeping cell detection histogram; (c) problematic dataset heat map.


From Fig. 8 it can be seen that Dominance Cell 2-Gram Symmetry Deviation
finds sleeping cell 1, while its neighboring cells 8, 9, 11 and 12 have suspiciously
high sleeping cell scores compared to other cells in the network.

Comparison of the symmetry analysis method with the two previously described
post-processing approaches shows that this method is very efficient in detecting the
sleeping cell and its neighbors. At the same time, the stability of this method, i.e.
its false alarm rate, is also very good, as can be seen from Fig. 8b.


6.2.4 Detection based on Target Cell Sub-Calls

Fig. 9 Results of sleeping cell detection for Target Cell Sub-Calls method:
(a) problematic dataset sleeping cell detection histogram; (b) reference dataset
sleeping cell detection histogram; (c) problematic dataset heat map; (d) reference
dataset heat map. Histogram horizontal axis: cell ID.

As discussed in Section 5.4, deviation between training and testing data
is not calculated in this method. Extensive location information, like dominance
map information, is not required for sleeping cell detection with the target cell
sub-call method. The sleeping cell detection histogram, presented in Fig. 9a, is
constructed by counting all unique target cell IDs for each anomalous sub-call. It can
be clearly seen that cell 1 is successfully detected. Neighboring cells 8, 9, 11 and 12
also contain indication of malfunction in this area, as can be noticed from the heat
map shown in Fig. 9c. For this method, the SC score of cell 1 is slightly lower than
for the post-processing methods based on dominance cell deviation. Another shortcoming
is that the target cell sub-call method is more prone to trigger false alarms. This can
be seen from the results when reference data is used as testing data, Fig. 9b. The
sleeping cell score of cell 6 reaches the threshold of mean plus 2 standard deviations.
For cells 16 and 17 SC scores are also quite high, as can also be noticed from Fig. 9d.
On the other hand, the target cell sub-call method is much simpler, and requires
significantly less information about the location of user event occurrence.
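
The counting step of this method is simple enough to sketch directly. The input
format (one list of target cell IDs per sub-call, plus one anomaly flag per
sub-call) and the function name are assumptions for illustration.

```python
from collections import Counter

def target_cell_histogram(sub_calls, is_anomalous):
    """Count, per cell, the anomalous sub-calls in which it is a target cell.

    sub_calls:    list of lists of target cell IDs, one list per sub-call
                  (hypothetical input format).
    is_anomalous: list of booleans, one flag per sub-call.
    """
    histogram = Counter()
    for targets, flag in zip(sub_calls, is_anomalous):
        if flag:
            # set() ensures each cell is counted once per anomalous sub-call.
            histogram.update(set(targets))
    return histogram
```

No training-versus-testing deviation and no dominance map are involved, which
reflects the lower information requirements noted above.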


6.3 Combined Method of Sleeping Cell Detection 

Fig. 10 Results of sleeping cell detection for amplified combined method:
(a) problematic dataset sleeping cell detection histogram; (b) reference dataset
sleeping cell detection histogram; (c) problematic dataset heat map; (d) reference
dataset heat map.

The idea of this method is to create a cumulative sleeping cell detection histogram
based on the results from all 4 post-processing methods described above. The
resulting amplified SC histogram is shown in Fig. 10a. Cell 1 has a sleeping cell score
well over the mean + 3σ threshold. Neighboring cells 8, 9, 11 and 12 also have increased
sleeping cell scores compared to other cells. Reference data used as testing data also
demonstrates the stability of the combined approach: no false alarms are triggered.
However, it can be seen that usage of the target cell sub-call method introduces some
noise. It is important to note that the post-processing methods are applied with equal
weights. It is possible to emphasize a more accurate method by increasing
its weight, and to penalize an unreliable one by reducing its weight. The selection
of optimal weights is a matter of a separate study and is not discussed in this
article.
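
A weighted combination of per-method histograms could look like the sketch
below. Equal weights are the default, as in the text; the function name, the
array layout, and the final renormalization to a unit sum are assumptions.

```python
import numpy as np

def combine_histograms(histograms, weights=None):
    """Weighted sum of per-method SC histograms.

    histograms: array-like of shape (n_methods, n_cells).
    weights:    optional per-method weights; equal weights when omitted.
    """
    h = np.asarray(histograms, dtype=float)
    w = np.ones(h.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    combined = w @ h
    # Assumption: renormalize so that the combined scores sum to 1.
    total = combined.sum()
    return combined / total if total > 0 else combined
```

Passing a smaller weight for one row reduces the influence of that method, for
example to limit the noise contributed by a method with more false alarms.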


6.4 Comparison of Algorithms and Performance Evaluation 

Fig. 11 Performance measures for comparison of sleeping cell detection algorithms:
(b) ROC curve of sleeping cell detection framework.

The post-processing methods discussed above have their own advantages and
disadvantages. Traditional data mining metrics, discussed in Section 5.5.1, are applied
for quantitative comparison of sleeping cell detection methods, Fig. 11a. Ideal
performance is presented with the solid double black line, and corresponds to the
maximum area of the hexagon. Formally, according to the values of the metrics,
the Dominance Cell 2-Gram Deviation and Dominance Cell Sub-Call Deviation
methods demonstrate better performance than the other post-processing techniques.
However, the high false positive rate for the Dominance Cell 2-Gram Symmetry Deviation
and Target Cell Sub-Call methods does not necessarily mean that these methods
are worse. The reason is that neighboring cells of cell 1 exceed the 3σ threshold.
This happens because adjacent cells are not completely independent, and are
affected by a malfunction in one of the neighbors. Thus, the Dominance Cell 2-Gram
Symmetry Deviation and Target Cell Sub-Call methods can be treated as more
sensitive than the others. The observed behavior emphasizes that amplification
should be complemented by some other ways to take network topology into
account. However, this is a subject for further study. The ROC curve of the designed
sleeping cell detection algorithm is presented in Fig. 11b. The proposed framework
is able to create such a projection of the MDT data that, in the new space, normal
and anomalous data points are fully separable and do not overlap. Hence,
the suggested data mining framework for sleeping cell detection is successful, and
reducing the false alarm rate requires a better separation rule
than the 3σ deviation from the mean SC score.
Fig. 12 Heuristic performance comparison of algorithms


Another method for comparison of post-processing algorithms is the heuristic
approach described in Section 5.5.1. According to this method, the more accurate
post-processing algorithm is the one which has the smallest distance to the ideal
solution point for either the problematic or the error-free case. Cumulative distances
for different algorithms in the non-amplified and amplified cases are presented in
Fig. 12a and Fig. 12b, respectively. It can be seen that the Dominance Cell 2-Gram
Symmetry Deviation method has the smallest distance from the ideal detection
case. Thus, from the perspective of the heuristic performance evaluation approach, this
method outperforms the other post-processing methods.


7 Conclusions 

This article presents a novel sleeping cell detection framework based on the knowledge
mining paradigm. MDT reports are used for the detection of a random access
channel malfunction in one of the network cells. The experimental setup implements
a simulated LTE network, used to generate a diverse statistics base with several
thousands of user calls and tens of thousands of MDT samples. The investigated type
of sleeping cell problem is rather complex, and detection of this problem has never
been studied before.

The designed knowledge mining framework is semi-supervised and has a centralized
architecture from the perspective of self-organizing networks. The heart of the
developed detection framework is the analysis of sequences with the N-gram method in
the series of user event-triggered MDT measurement reports. Data pre-processing
with the sliding window transformation method makes the statistics base
more reliable through standardization of the input event sequences. 2-gram analysis
is used to convert sequential data to numeric format in the new feature space.
To simplify analysis of the data in the new space, dimensionality reduction with the
minor component analysis method is applied. The K-NN anomaly score detection
algorithm is used to find the outliers in the data. Using this information, anomalous
data points are converted with post-processing to knowledge about the location
of the problematic regions in the network. Different location mapping
post-processing methods are compared. Additionally, so-called amplification is used
to take into account neighbor relations between cells and network topology, for
improvement of sleeping cell detection performance.

Results demonstrate that the suggested framework allows for efficient detection
of the random access sleeping cell problem in the network. Evaluation shows
that the post-processing method named Dominance Cell 2-Gram Symmetry Deviation
demonstrates the best combination of performance results. Amplification also
proves to be a very efficient approach for improvement of the detection quality.
Results of this work lay grounds and suggest exact methods for building advanced
performance monitoring systems in modern mobile networks. One of the possible
directions in this area is extensive usage of data mining techniques in general, and
anomaly detection in particular. New systems of network maintenance would make it
possible to address the growing complexity and heterogeneity of modern mobile networks,
and especially 5th Generation (5G).

Future work in this field includes validation of the developed system in more
complex scenarios, detection of several or different types of malfunctions, and
replacement of the semi-supervised approach with an unsupervised one. The ultimate
goal is to achieve accurate and timely detection of different sleeping cell types in
highly dynamic mobile network environments. A low level of false alarms must be
maintained, and at the same time a significant increase of computational complexity
should be avoided.